Non-independent Variables 

1. Partial differentiation with non-independent variables. 

Up to now, in calculating partial derivatives of functions like $w = f(x, y)$ or $w = f(x, y, z)$, 
we have assumed the variables $x, y$ (or $x, y, z$) were independent. However, in real-world 
applications this is frequently not so. Computing partial derivatives then becomes confusing, 
but it is better to face these complications now, while you are still in a calculus course, 
than to wait to be hit with them at the same time that you are struggling to cope with the 
thermodynamics or economics or whatever else is involved. 

For example, in thermodynamics, three variables that are associated with a contained 
gas are its 

$p$ = pressure, $v$ = volume, $T$ = temperature, 

and you can express other thermodynamic variables like the internal energy $U$ and entropy 
$S$ in terms of $p$, $v$, and $T$. 

However, $p$, $v$, and $T$ are not independent variables. If the gas is a so-called "ideal gas", 
they are related by the equation 

(1) $pv = nRT$ ($n$, $R$ constants). 

To see what complications this produces, let's consider first a purely mathematical example. 

Example 1. Let $w = x^2 + y^2 + z^2$, where $z = x^2 + y^2$. Calculate $\partial w/\partial x$. 

Discussion. 

(a) If we think of $x$ and $y$ as the independent variables, then we can calculate $\partial w/\partial x$ by 
two different methods: 

(i) using $z = x^2 + y^2$ to get rid of $z$, we get 

$$w = x^2 + y^2 + (x^2 + y^2)^2 = x^2 + y^2 + x^4 + 2x^2y^2 + y^4;$$

$$\frac{\partial w}{\partial x} = 2x + 4x^3 + 4xy^2;$$

(ii) or by using the chain rule, remembering $z$ is a function of $x$ and $y$, 

$$w = x^2 + y^2 + z^2;$$

$$\frac{\partial w}{\partial x} = 2x + 2z\,\frac{\partial z}{\partial x} = 2x + 2z \cdot 2x = 2x + 2(x^2 + y^2) \cdot 2x,$$

so the two methods agree. 

(b) On the other hand, if we think of $x$ and $z$ as the independent variables, using say 
method (i) above, we get rid of $y$ by using the relation $y^2 = z - x^2$, and get 

$$w = x^2 + y^2 + z^2 = x^2 + (z - x^2) + z^2 = z + z^2;$$

$$\frac{\partial w}{\partial x} = 0.$$

These answers are genuinely different: we cannot convert one into the other by using 
the relation $z = x^2 + y^2$. Will the right $\partial w/\partial x$ please stand up? 

The answer is that there is no one right answer, because the problem was not well-stated. 
When the variables are not independent, an expression like $\partial w/\partial x$ has no definite meaning. 

To see why this is so, we interpret the above example geometrically. Saying that $x, y, z$ 
satisfy the relation $z = x^2 + y^2$ means that the point $(x, y, z)$ lies on the paraboloid surface 
formed by rotating $z = x^2$ about the $z$-axis. The function 

$$w = x^2 + y^2 + z^2$$

measures the square of the distance from the origin. To be definite, let's suppose we are 
at the starting point $P = P_0 : (1, 0, 1)$ indicated, and we want to calculate $\partial w/\partial x$ 
at this point. 

[Figure: the paraboloid in $xyz$-space, with the starting point $P_0$ marked.] 

Case (a) If we take $x$ and $y$ to be the independent variables, then to find 
$\partial w/\partial x$, we hold $y$ fixed and let $x$ vary. So $P$ moves in the $xz$-plane towards $A$, 
along the path shown. 

As $P$ moves along this path, evidently $w$, the square of its distance from the 
origin, is steadily increasing: $\partial w/\partial x > 0$, and in fact the calculations for (a) 
above show that $\partial w/\partial x = 6$. 

Case (b) If we take $x$ and $z$ to be the independent variables, then to find 
$\partial w/\partial x$, we hold $z$ fixed and let $x$ vary. Now $P$ moves in the plane $z = 1$, along 
the circular path towards $B$. 

As $P$ moves on this path, the square of its distance from the origin is not 
changing, and therefore $\partial w/\partial x = 0$, as we calculated in (b) before. 

To sum up, the value of $\partial w/\partial x$ depends on which variables we take to be independent, 
because we are measuring different rates of change, as $P$ moves along different paths. 
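
The two cases can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`w_xy`, `w_xz`, `difference_quotient`) and the step size are illustrative assumptions. It estimates the rate of change of $w$ at $P_0 = (1, 0, 1)$ once with $y$ held fixed and once with $z$ held fixed.

```python
def w_xy(x, y):
    """w with x, y independent; the constraint fixes z = x^2 + y^2."""
    z = x**2 + y**2
    return x**2 + y**2 + z**2


def w_xz(x, z):
    """w with x, z independent; the constraint fixes y^2 = z - x^2."""
    y_squared = z - x**2
    return x**2 + y_squared + z**2


def difference_quotient(f, x, held, h):
    """One-sided estimate of the partial of f(x, held) in x, with `held` fixed."""
    return (f(x + h, held) - f(x, held)) / h


# Start at P0 = (1, 0, 1).
# Case (a): hold y = 0 and vary x slightly.
dw_dx_y_fixed = difference_quotient(w_xy, 1.0, 0.0, 1e-6)
# Case (b): hold z = 1. On the circle x^2 + y^2 = 1 we have x <= 1,
# so from x = 1 we can only step in the negative direction.
dw_dx_z_fixed = difference_quotient(w_xz, 1.0, 1.0, -1e-6)

print(f"y held fixed: {dw_dx_y_fixed:.4f}")
print(f"z held fixed: {dw_dx_z_fixed:.4f}")
```

The first estimate should come out close to 6 and the second close to 0, matching cases (a) and (b): the same symbol $\partial w/\partial x$ measures two different rates of change.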

There is only one way out of our difficulty. When we ask for $\partial w/\partial x$, we must at the same 
time specify which variables are to be taken as the independent ones. This is done by using 
the following notation: 

Case (a): $x, y$ are the independent variables: $\left(\dfrac{\partial w}{\partial x}\right)_y$ 

Case (b): $x, z$ are the independent variables: $\left(\dfrac{\partial w}{\partial x}\right)_z$ 

These are read, "the partial of $w$ with respect to $x$, with $y$ (resp. $z$) held constant". 

Note how in each case the two lower letters give you the two independent variables. If we 
had more variables, we would use a similar notation. For instance if 

(2) $w = f(x, y, z, t)$, where $xy = zt$, 



then only three of the variables $x, y, z, t$ can be independent; the fourth is then determined 
by the equation on the right of (2). Thus we would write expressions like 

$\left(\dfrac{\partial w}{\partial x}\right)_{y,t}$ "partial of $w$ with respect to $x$; $y$ and $t$ held constant"; 

$\left(\dfrac{\partial w}{\partial y}\right)_{x,z}$ "partial of $w$ with respect to $y$; $x$ and $z$ held constant"; 

in the first, $x, y, t$ are the independent variables; in the second, $x, y, z$ are independent. 

2. Differentials vs. Chain Rule 

An alternative way of calculating partial derivatives uses total differentials. We illustrate 
with an example, doing it first with the chain rule, then repeating it using differentials. By 
definition, the differential of a function of several variables, such as $w = f(x, y, z)$, is 

(3) $dw = f_x\,dx + f_y\,dy + f_z\,dz$, 

where the three partial derivatives $f_x, f_y, f_z$ are the formal partial derivatives, i.e., the 
derivatives calculated as if $x, y, z$ were independent. 

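Example 2. Find $\left(\dfrac{\partial w}{\partial y}\right)_{x,t}$, where $w = x^3y - z^2t$ and $xy = zt$. 

Solution 1. Using the chain rule and the two equations in the problem, we have 

$$\left(\frac{\partial w}{\partial y}\right)_{x,t} = x^3 - 2zt\left(\frac{\partial z}{\partial y}\right)_{x,t} = x^3 - 2zt \cdot \frac{x}{t} = x^3 - 2zx.$$

Here $(\partial z/\partial y)_{x,t} = x/t$ comes from differentiating $xy = zt$ with respect to $y$, 
holding $x$ and $t$ fixed. 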



Solution 2. We take the differentials of both sides of the two equations in the problem: 

(4) $dw = 3x^2y\,dx + x^3\,dy - 2zt\,dz - z^2\,dt$, $\qquad y\,dx + x\,dy = z\,dt + t\,dz$. 

Since the problem indicates that x, y, t are the independent variables, we eliminate dz from 
the equations in (4) by multiplying the second equation by 2z, adding it to the first, then 
grouping the terms, which gives 

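$$dw = (3x^2y - 2zy)\,dx + (x^3 - 2zx)\,dy + z^2\,dt.$$

Comparing this with (3), after replacing $z$ by $t$ in (3), we see that 

$$\left(\frac{\partial w}{\partial x}\right)_{y,t} = 3x^2y - 2zy, \qquad \left(\frac{\partial w}{\partial y}\right)_{x,t} = x^3 - 2zx, \qquad \left(\frac{\partial w}{\partial t}\right)_{x,y} = z^2.$$

(The actual partial derivatives are the same as the formal partial derivatives $w_x, w_y, w_t$ 
because $x, y, t$ are independent variables.) 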
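
As a sanity check on Example 2, here is a small numerical sketch. It is an added illustration rather than part of the notes: the function names, the sample point, and the step size are assumptions chosen for the example. It eliminates $z$ through $z = xy/t$ (which requires $t \neq 0$) and compares central-difference estimates against the three coefficients found above.

```python
def w_of(x, y, t):
    """w = x^3 y - z^2 t with x, y, t independent and z = xy/t from xy = zt."""
    z = x * y / t
    return x**3 * y - z**2 * t


def partial(f, point, index, h=1e-6):
    """Central-difference partial of f in coordinate `index`, others held fixed."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


# Arbitrary sample point with t != 0 (an assumption for illustration).
x, y, t = 2.0, 3.0, 1.5
z = x * y / t

numeric = [partial(w_of, (x, y, t), i) for i in range(3)]
formula = [
    3 * x**2 * y - 2 * z * y,  # (dw/dx) with y, t held constant
    x**3 - 2 * z * x,          # (dw/dy) with x, t held constant
    z**2,                      # (dw/dt) with x, y held constant
]

for name, n, f in zip("xyt", numeric, formula):
    print(f"d/d{name}: numeric {n:.5f}, formula {f:.5f}")
```

At any point with $t \neq 0$ the two columns should agree to within the finite-difference error.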

Notice that the differential method here takes a bit more calculation, but gives us three 
derivatives, not just one; this is fine if you want all three, but a little wasteful if you don't. 
The main thing to keep in mind for the method is that differentials are treated like vectors, 
with the $dx, dy, dz, \ldots$ playing the role of $\mathbf{i}, \mathbf{j}, \mathbf{k}, \ldots$. That is: 

D1. Differentials can be added, subtracted, and multiplied by scalar functions; 

D2. If the variables $x, y, \ldots$ are independent, two differentials are equal if and only if 
their corresponding coefficients are equal: 

(5) $A\,dx + B\,dy + \ldots = A_1\,dx + B_1\,dy + \ldots \iff A = A_1,\ B = B_1,\ \ldots$; 

D3. One differential can be substituted into another. 
Remarks. 

1. In Example 2, Solution 2, we used the operations in D1 to do the calculations. We 
used D2 in the last step, taking advantage of the fact that $x, y, t$ were independent. 

We could have done the calculations using D3 instead, by solving the second equation in 
(4) for dz and substituting it into the first equation. D3 is a consequence of the chain rule. 
Illustrations of its use will be given in the next section. 
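
For comparison, the D3 route for Example 2 might look as follows. This worked version is an added sketch, not part of the original solution, and it assumes $t \neq 0$ so that the constraint differential can be solved for $dz$.

```latex
% Solve the second equation in (4) for dz (assumes t \neq 0):
dz = \frac{y\,dx + x\,dy - z\,dt}{t}

% Substitute into the first equation in (4) and collect terms:
\begin{aligned}
dw &= 3x^2y\,dx + x^3\,dy - 2zt \cdot \frac{y\,dx + x\,dy - z\,dt}{t} - z^2\,dt \\
   &= 3x^2y\,dx + x^3\,dy - 2zy\,dx - 2zx\,dy + 2z^2\,dt - z^2\,dt \\
   &= (3x^2y - 2zy)\,dx + (x^3 - 2zx)\,dy + z^2\,dt
\end{aligned}
```

This is the same expression for $dw$ that Solution 2 reaches by elimination.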

2. The main advantage of calculating with differentials is that one need not take into 
account whether the variables are dependent or not, or which variables depend on which 
others; the method does this automatically for you. Examples will illustrate. 

3. If the variables are not independent, D2 is emphatically not true; the second equation 
in (4) gives a counterexample. 

Note also that in D1, there is no attempt to include a "multiplication" or "division" of 
differentials in the list of operations. If $u$ and $v$ are functions of several variables, then 
their "product" $du\,dv$ makes no sense as a differential, nor does their "quotient" $du/dv$, 
which despite appearances is not in general related to any derivative, or function, or even 
defined. (There is no elementary analogue of the dot and cross product of vectors, though in 
advanced differential geometry courses a certain type of product for differentials is defined 
and used for multiple integration.) 

Example 3. Let $w = x^2 - yz + t^2$, where $x, y, z, t$ satisfy the two equations 

$$z^2 = x + y^2 \quad\text{and}\quad xy = zt.$$

Using these equations, we can express first $z$ and then $t$ in terms of $x$ and $y$; this means 
that $w$ can also be expressed in terms of $x$ and $y$. Without actually calculating $w(x, y)$ 
explicitly, find its gradient vector $\nabla w(x, y)$. 

Solution. Since we need both partial derivatives $(\partial w/\partial x)_y$ and $(\partial w/\partial y)_x$, it makes 
sense to use the differential method. Taking the differential of $w$ and of the two equations 
connecting the variables gives us 

(6) $dw = 2x\,dx - z\,dy - y\,dz + 2t\,dt$, $\qquad x\,dy + y\,dx = z\,dt + t\,dz$, $\qquad 2z\,dz = dx + 2y\,dy$. 

We want $x$ and $y$ to be the independent variables; using the operations in D1, first eliminate 
$dt$ by solving for it in the second equation, and substituting for it into the first equation; then 
eliminate $dz$ by solving for it in the last equation and substituting into the first equation; 
the result is 

(7) $dw = \left(2x - \dfrac{y}{2z} + \dfrac{2ty}{z} - \dfrac{t^2}{z^2}\right)dx + \left(-z - \dfrac{y^2}{z} + \dfrac{2xt}{z} - \dfrac{2t^2y}{z^2}\right)dy.$ 

Since $x$ and $y$ are independent, comparing the two expressions for $dw$ in (7) and (3) (using 
$x$ and $y$), and then using D2, shows that the two coefficients in (7) are respectively the two 
partial derivatives $w_x$ and $w_y$, i.e., the two components of the gradient $\nabla w$. 
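
The coefficients in (7) can be spot-checked numerically. The sketch below is an added illustration under stated assumptions: it picks the branch $z = +\sqrt{x + y^2}$ (the notes do not fix a sign for $z$), uses an arbitrary sample point, and all function names are hypothetical.

```python
import math


def solve_z_t(x, y):
    """Solve z^2 = x + y^2 (taking z > 0, an assumption) and xy = zt for z and t."""
    z = math.sqrt(x + y**2)
    t = x * y / z
    return z, t


def w_of(x, y):
    """w = x^2 - yz + t^2 as a function of the independent variables x, y."""
    z, t = solve_z_t(x, y)
    return x**2 - y * z + t**2


def gradient_formula(x, y):
    """The two coefficients of dx and dy in (7)."""
    z, t = solve_z_t(x, y)
    w_x = 2 * x - y / (2 * z) + 2 * t * y / z - t**2 / z**2
    w_y = -z - y**2 / z + 2 * x * t / z - 2 * t**2 * y / z**2
    return w_x, w_y


def gradient_numeric(x, y, h=1e-6):
    """Central-difference estimate of the gradient of w_of."""
    w_x = (w_of(x + h, y) - w_of(x - h, y)) / (2 * h)
    w_y = (w_of(x, y + h) - w_of(x, y - h)) / (2 * h)
    return w_x, w_y


# Sample point (an assumption): x = 3, y = 1, so z = 2 and t = 1.5.
exact = gradient_formula(3.0, 1.0)
approx = gradient_numeric(3.0, 1.0)
print("formula:", exact)
print("numeric:", approx)
```

Agreement between the two tuples supports the reconstruction of (7) without ever writing $w(x, y)$ out by hand.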

Example 4. Suppose the variables $x, y, z$ satisfy an equation $g(x, y, z) = 0$. Assume 
the point $P : (1, 1, 1)$ lies on the surface $g = 0$ and that $(\nabla g)_P = (-1, 1, 2)$. 

Let $f(x, y, z)$ be another function, and assume that $(\nabla f)_P = (1, 2, 1)$. 

Find the gradient of the function $w = f(x, y, z(x, y))$ of the two independent variables $x$ 
and $y$, at the point $x = 1$, $y = 1$. 

Solution. Using differentials, we have, by (3) and our hypotheses, 

$$(dw)_P = dx + 2\,dy + dz; \qquad (dg)_P = -dx + dy + 2\,dz = 0, \ \text{since } dg = 0 \text{ for all } x, y, z;$$

eliminating $dz$ by solving the second equation for it and substituting into the first, or by 
dividing the second equation by 2 and subtracting it from the first, we get 

$$(dw)_P = \tfrac{3}{2}\,dx + \tfrac{3}{2}\,dy; \qquad (\nabla w)_P = \tfrac{3}{2}\,\mathbf{i} + \tfrac{3}{2}\,\mathbf{j}.$$
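
The elimination in Example 4 works for any pair of gradients with $g_z \neq 0$, which suggests a small helper. This is a minimal sketch, not from the notes; the function name `constrained_gradient` is hypothetical, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction


def constrained_gradient(grad_f, grad_g):
    """Gradient of w = f(x, y, z(x, y)), where z is defined implicitly by g = 0.

    From dg = 0 we get dz = -(g_x dx + g_y dy) / g_z, which needs g_z != 0.
    Substituting into dw = f_x dx + f_y dy + f_z dz gives the coefficients below.
    """
    f_x, f_y, f_z = (Fraction(c) for c in grad_f)
    g_x, g_y, g_z = (Fraction(c) for c in grad_g)
    if g_z == 0:
        raise ValueError("g_z must be nonzero to solve dg = 0 for dz")
    return (f_x - f_z * g_x / g_z, f_y - f_z * g_y / g_z)


# Values at P from Example 4: grad f = (1, 2, 1), grad g = (-1, 1, 2).
grad_w = constrained_gradient((1, 2, 1), (-1, 1, 2))
print(grad_w)
```

With the data of Example 4 both components come out to 3/2, in agreement with the hand calculation.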